R-Tree:

What is an R-Tree?

R-Trees are data structures used to efficiently store and search for rectangles or spatial objects (like bounding boxes). Each node in the tree stores a bounding box that covers its children. Leaf nodes contain the actual objects (e.g., points, rectangles), and non-leaf nodes contain bounding boxes that group those objects together. When inserting a new object, the tree finds the node whose box needs the least enlargement to include the new one. If a node gets too full, it splits. Where It's Used Geographic Information Systems (GIS) (e.g., finding map features in a region) Game engines (to detect objects in view or near each other) Databases (spatial indexing of coordinates) Computer graphics (fast collision and intersection detection) Strengths Very efficient for range queries (e.g., "find all objects in this area") Works well for rectangular or spatial data Can handle overlapping or irregular shapes Supports dynamic insertions and deletions Weaknesses Overlapping bounding boxes in nodes can reduce performance Balancing the tree isn't always perfect (it’s not strictly height-balanced like AVL trees) Can be less efficient than KD-Trees for exact nearest-neighbor searches

R-Tree Operations

Construction:

  1. Start with empty tree
  2. Insert entries one by one using insertion algorithm

Insertion:

  1. Choose Leaf: Traverse to node needing least bounding box expansion
  2. Insert Entry: Add object and update bounding boxes
  3. Split If Needed: On overflow:
    • Apply quadratic or linear split
    • Propagate adjustments upward

Search:

  1. Range Query: Visit nodes overlapping the query region
  2. Nearest Neighbor: Use best-first traversal with distance pruning

Deletion:

  1. Find and remove target entry
  2. Condense tree and reinsert orphaned nodes

Complexities:

OperationAverageWorst
Bulk LoadingO(n log n)O(n²)
InsertionO(log n)O(n)
DeletionO(log n)O(n)
Range QueryO(logn + m)O(n)
Nearest NeighborO(log n)O(n)
k-NN QueryO(k log n)O(kn)

Bulk Loading:

R-Trees can be built efficiently using bulk-loading algorithms like Sort-Tile-Recursive (STR), which sort and group spatial objects to minimize overlap. This process reduces tree height and clustering issues, and typically takes O(n log n) time due to sorting and hierarchical insertion.

Insertion/Deletion:

Standard insertions and deletions both traverse from the root to a leaf, making decisions at each level. These operations take O(log n) on average since the tree remains balanced and shallow due to grouping of multiple entries per node.

k-NN Search:

To find the k closest points, R-Trees use a priority queue to explore nodes ordered by distance. This allows pruning of far regions early, leading to O(k log n) on average. As with NN, poor clustering or overlapping rectangles can worsen performance.

Range Query:

Range queries recursively check which bounding rectangles intersect the query window. The tree’s height is log n, and only overlapping branches are explored. With m results returned, complexity is O(log n + m) in most cases.

Node Capacity [Minimum = 2] =
Deletes all points in the tree.
Generates a new tree with 10 random points.
Adds 20 randomly generated points to the current tree.
After clicking this, you can pick any spot in the canvas as a query point, and its nearest 5 neighbors will be highlighted from it.